Filesystems for Streaming Databases
Authors
Abstract
Workloads for high-performance streaming databases often contain many writes of small data blocks (for example, of metadata) followed by large subrange queries. Most of today’s file systems and databases cannot provide adequate performance for the write phase, the read phase, or both. The supercomputing technologies group at MIT CSAIL has been investigating cache-aware and cache-oblivious data structures for disk-resident streaming data. We investigated the cache-aware buffered repository tree (BRT). A BRT with a block size of B can theoretically perform a write in time O((log_B N)/√B), as compared to the B-tree’s O(log_B N), and can perform reads only a constant factor slower than the B-tree. We implemented a prototype of a cache-aware streaming B-tree. For 1-megabyte blocks the streaming B-tree achieves a 230-fold speedup for random insertions, at the cost of slowing down serial insertions by a factor of 6. Preliminary measurements on the SSCA#3 IO benchmark show a 1.4-fold speedup using streaming B-trees instead of standard B-trees. We have also designed a cache-oblivious data structure called the cache-oblivious lookahead tree, which should be able to achieve similar bounds without requiring us to tune for the
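As a rough illustration of the two write bounds quoted in the abstract, the sketch below evaluates them with all constant factors dropped. The function names and the choice to measure B in records per block are assumptions made for this example, not taken from the paper.

```python
import math


def btree_write_cost(n: int, b: int) -> float:
    """B-tree write bound O(log_B N) with constants dropped.

    n: number of stored records; b: records per block (assumed unit).
    """
    return math.log(n) / math.log(b)


def brt_write_cost(n: int, b: int) -> float:
    """BRT write bound O((log_B N) / sqrt(B)) with constants dropped."""
    return btree_write_cost(n, b) / math.sqrt(b)


def write_cost_ratio(n: int, b: int) -> float:
    """How many times cheaper a BRT write is under this idealized model."""
    return btree_write_cost(n, b) / brt_write_cost(n, b)


if __name__ == "__main__":
    # Hypothetical sizes, chosen only to show how the ratio grows with B.
    for b in (256, 4096, 65536):
        print(b, write_cost_ratio(10**9, b))
```

In this idealized model the ratio is simply √B, so it grows with the block size. It is not a prediction of the measured 230-fold speedup, which also depends on constants and hardware that the asymptotic bounds ignore.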
Similar resources
Design and Test of the Real-time Text mining dashboard for Twitter
One of today's major research trends in the field of information systems is the discovery of implicit knowledge hidden in datasets that are currently being produced at high speed, in large volumes, and in a wide variety of formats. Data with such features is called big data. Extracting, processing, and visualizing this huge amount of data has become one of the concerns of data science scholar...
A simple and novel method for acoustic streaming power measurement of ultrasonic horn
An ultrasonic horn that transfers acoustic waves into an aqueous solution produces unique properties. The transfer of a sound wave into a liquid moves the liquid in the direction of wave propagation, and the wave gradually loses its energy to viscous friction. This wave motion induces a flow known as acoustic streaming or micro-streaming. In this article, a simple innovative ...
Design for a Tag-Structured Filesystem
Tagging is an organizational system commonly used as an alternative to hierarchical systems. Many authors have recognized the desirability of a filesystem accessed with a tagging interface, as opposed to or in addition to a traditional directory interface. Existing tagging filesystems universally use conventional, hierarchical filesystems as a backing store. This paper examines the challenges i...
Prism: Providing Flexible and Fast Filesystem Cloning Service for Virtual Servers
This paper describes a prototype virtualized file system, Prism, for supporting hosted servers and utility computing. Prism provides a filesystem service that allows lightweight creation of filesystems for new users from existing filesystems. All users’ filesystems are mutable and yet isolated from each other. In our experiments, new filesystems can be created from existing ones in under one-fi...
Hybrid algorithms for Job shop Scheduling Problem with Lot streaming and A Parallel Assembly Stage
In this paper, a job shop scheduling problem with a parallel assembly stage and lot streaming (LS) is considered for the first time in both the machining and assembly stages. Lot streaming is a technique for splitting jobs into smaller sub-jobs so that successive operations can be overlapped. Hence, to solve job shop scheduling problem with a parallel assembly stage and lot streaming, deci...